Reliability
Reliability: Research requires dependable measurement. (Nunnally) Measurements are reliable to the extent that they are repeatable; any random influence that tends to make measurements differ from occasion to occasion or from circumstance to circumstance is a source of measurement error. (Gay) Reliability is the degree to which a test consistently measures whatever it measures. Errors of measurement that affect reliability are random errors, whereas errors of measurement that affect validity are systematic or constant errors. Test-retest, equivalent-forms, and split-half reliability are all determined through correlation.

Test-retest Reliability: Test-retest reliability is the degree to which scores are consistent over time. It indicates the score variation that occurs from testing session to testing session as a result of errors of measurement. The same test is given to the same group twice and the two sets of scores are correlated; the obtained coefficient is called the coefficient of stability. Problems: memory, maturation, learning.

Equivalent-Forms or Alternate-Forms Reliability: Equivalent forms are two tests that are identical in every way except for the actual items included. This method is used when test takers are likely to recall responses made during the first session and when alternate forms are available. Both forms are administered to the same group and the two sets of scores are correlated. The obtained coefficient is called the coefficient of equivalence when the forms are administered at essentially the same time, or the coefficient of stability and equivalence when they are administered at different times. Problem: the difficulty of constructing two forms that are essentially equivalent.

Both of the above methods require two administrations.

Split-Half Reliability: Split-half reliability requires only one administration and is especially appropriate when the test is very long. The most commonly used way to split the test in two is the odd-even strategy: odd-numbered items form one half and even-numbered items form the other, and the two half-test scores are correlated. Since longer tests tend to be more reliable, and since the split-half coefficient represents the reliability of a test only half as long as the actual test, a correction formula, the Spearman-Brown prophecy formula, must be applied to the coefficient. Split-half reliability is a form of internal consistency reliability.
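As a rough illustration of the odd-even split and the Spearman-Brown correction described above, the following Python sketch may help. It assumes the usual form of the correction for a test twice as long, r_full = 2 * r_half / (1 + r_half). The function names, the `length_factor` parameter, and the small response matrix are hypothetical and were made up for this example; they do not come from the source.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half, length_factor=2.0):
    """Project reliability for a test `length_factor` times as long.

    With the default factor of 2 this is the split-half correction:
    2 * r_half / (1 + r_half).
    """
    return length_factor * r_half / (1 + (length_factor - 1) * r_half)


def split_half_reliability(item_scores):
    """Odd-even split-half reliability.

    item_scores holds one list per test taker, each containing that
    person's item scores (here 1 = correct, 0 = incorrect).
    Returns (half-test correlation, Spearman-Brown corrected estimate).
    """
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return r_half, spearman_brown(r_half)


# Hypothetical data: 5 test takers, 6 dichotomously scored items.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]

r_half, r_full = split_half_reliability(responses)
print("half-test correlation:", round(r_half, 3))
print("corrected estimate:   ", round(r_full, 3))
```

For any positive half-test correlation below 1, the corrected estimate is larger than the uncorrected one, which reflects the point that longer tests tend to be more reliable.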
Rationale Equivalence Reliability: Rationale equivalence reliability is not established through correlation but rather estimates internal consistency by determining how all items on a test relate to all other items and to the total test.

Internal Consistency Reliability: Internal consistency is determined by how all items on the test relate to all other items. The Kuder-Richardson formula yields an estimate of reliability that is essentially equivalent to the average of the split-half reliabilities computed for all possible halves.

Standard Error of Measurement: Reliability can also be expressed in terms of the standard error of measurement. It is an estimate of how often you can expect errors of a given size.

Reliability refers to a measure's ability to capture an individual's true score, i.e., to distinguish accurately one person from another. While a reliable measure will be consistent, consistency can be seen as a by-product of reliability: in a case of perfect consistency (everyone scores the same and gets the same score repeatedly), reliability coefficients could not be calculated, because there would be no variance or covariance from which to compute a correlation. The error in our analyses is due to individual differences but also to the measure not being perfectly reliable.

· Criteria of reliability
  · Test-retest
  · Test components (internal consistency)
· Test-retest reliability
  · Consistency of measurement for individuals over time
  · Individuals score similarly, e.g., today and 6 months from now
· Issues
  · Memory
  · If the two administrations are too close in time, the correlation between scores reflects memory of item responses rather than the true score being captured
· Chance covariation
  · In sample data, any two variables will almost always show some non-zero correlation by chance alone
· Reliability is not constant across subsets of a population
  · General IQ scores: good reliability
  · IQ scores for college students: less reliable
  · Restriction of range means fewer individual differences
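Returning to the internal consistency estimate and the standard error of measurement discussed above, the following Python sketch shows one way they might be computed. It assumes the Kuder-Richardson formula 20 (KR-20) for dichotomous items, since the text does not say which Kuder-Richardson formula is meant, and the common expression SEM = SD * sqrt(1 - reliability). The function names, the response matrix, and the SD of 15 with reliability .91 are hypothetical values chosen for illustration.

```python
from statistics import pvariance


def kr20(item_scores):
    """KR-20 internal consistency estimate for 0/1 scored items.

    item_scores holds one list per test taker of item scores.
    Formula: (k / (k - 1)) * (1 - sum(p * q) / variance of total scores),
    where p is the proportion answering an item correctly and q = 1 - p.
    """
    n = len(item_scores)          # number of test takers
    k = len(item_scores[0])       # number of items
    totals = [sum(row) for row in item_scores]
    var_total = pvariance(totals)  # population variance, matching p * q
    if var_total == 0:
        # Everyone has the same total score: no variance, so no coefficient.
        raise ValueError("total scores have zero variance")
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * (1 - reliability) ** 0.5


# Hypothetical data: 5 test takers, 6 dichotomously scored items.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]

print("KR-20:", round(kr20(responses), 3))
# Hypothetical test with SD = 15 and reliability = .91
print("SEM:  ", round(standard_error_of_measurement(15, 0.91), 2))
```

The zero-variance check mirrors the point made in the notes: when everyone obtains the same score, there is no variance from which a reliability coefficient can be computed.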